Spoken Document Retrieval Leveraging Unsupervised and Supervised Topic Modeling Techniques
Similar resources
Spoken Document Retrieval Leveraging Unsupervised and Supervised Topic Modeling Techniques
This paper describes the application of two attractive categories of topic modeling techniques to the problem of spoken document retrieval (SDR), viz. document topic model (DTM) and word topic model (WTM). Apart from using the conventional unsupervised training strategy, we explore a supervised training strategy for estimating these topic models, imagining a scenario that user query logs along ...
Leveraging Relevance Cues for Improved Spoken Document Retrieval
Spoken document retrieval (SDR) has emerged as an active area of research in the speech processing community. The fundamental problems facing SDR are generally three-fold: 1) a query is often only a vague expression of an underlying information need, 2) there probably would be word usage mismatch between a query and a spoken document even if they are topically related to each other, and 3) the ...
Supervised and Unsupervised Web Document Filtering Techniques to Improve Text-Based Music Retrieval
We aim at improving a text-based music search engine by applying different techniques to exclude misleading information from the indexing process. The idea of the original approach is to index music pieces by “contextual” information, more precisely, by all texts to be found on Web pages retrieved via a common Web search engine. This representation allows for issuing arbitrary textual queries t...
A Comparative Study of Methods for Topic Modeling in Spoken Document Retrieval
Topic modeling for information retrieval (IR) has attracted significant attention and demonstrated good performance in a wide variety of tasks over the years. In this paper, we first present a comprehensive comparison of various topic modeling approaches, including the so-called document topic models (DTM) and word topic models (WTM), for Chinese spoken document retrieval (SDR). Moreover, diffe...
Unsupervised Topic Modeling Approaches to Decision Summarization in Spoken Meetings
We present a token-level decision summarization framework that utilizes the latent topic structures of utterances to identify “summary-worthy” words. Concretely, a series of unsupervised topic models is explored and experimental results show that fine-grained topic models, which discover topics at the utterance-level rather than the document-level, can better identify the gist of the decision-mak...
Journal
Journal title: IEICE Transactions on Information and Systems
Year: 2012
ISSN: 0916-8532,1745-1361
DOI: 10.1587/transinf.e95.d.1195